Weighted Automata Algorithms

نویسنده

  • Mehryar Mohri
چکیده

This chapter presents several fundamental algorithms for weighted automata and transducers. While the mathematical counterparts of weighted transducers, rational power series , have been extensively studied in the past [22, 54, 13, 36], several essential weighted transducer algorithms, e.g., composition, determinization, minimization, have been devised only in the last decade [38, 43], in part motivated by novel applications in speech recognition, speech synthesis, machine translation, other areas of natural language processing, image processing, optical character recognition, and more recently machine learning. These algorithms can be viewed as the generalization to the weighted transducer case of the standard algorithms for unweighted acceptors. However, this generalization is often not straightforward and has required a number of specific studies either because the old schema could not be applied in the presence of weights and a novel technique was required, as in the case of composition [50, 46], or because of the analysis of the conditions of application of an algorithm as in the case of determinization [38, 3]. The chapter favors a presentation of weighted automata and transducers in terms of graphs, the natural concepts for an algorithmic description and complexity analysis. Also, while power series lead to more concise and rigorous proofs in most cases [36], proofs related to questions of ambiguity naturally require the introduction of paths and reasoning on graph concepts.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient Computation of the Relative Entropy of Probabilistic Automata

The problem of the efficient computation of the relative entropy of two distributions represented by deterministic weighted automata arises in several machine learning problems. We show that this problem can be naturally formulated as a shortest-distance problem over an intersection automaton defined on an appropriate semiring. We describe simple and efficient novel algorithms for its computati...

متن کامل

A Unified Construction of the Glushkov, Follow, and Antimirov Automata (TR2006-880)

Many techniques have been introduced in the last few decades to create -free automata representing regular expressions: Glushkov automata, the so-called follow automata, and Antimirov automata. This paper presents a simple and unified view of all these -free automata both in the case of unweighted and weighted regular expressions. It describes simple and general algorithms with running time com...

متن کامل

Series, Weighted Automata, Probabilistic Automata and Probability Distributions for Unranked Trees

We study tree series and weighted tree automata over unranked trees. The message is that recognizable tree series for unranked trees can be defined and studied from recognizable tree series for binary representations of unranked trees. For this we prove results of [1] as follows. We extend hedge automata – a class of tree automata for unranked trees – to weighted hedge automata. We define weigh...

متن کامل

Weighted Finite-State Transducer Algorithms An Overview

Weighted finite-state transducers are used in many applications such as text, speech and image processing. This chapter gives an overview of several recent weighted transducer algorithms, including composition of weighted transducers, determinization of weighted automata, a weight pushing algorithm, and minimization of weighted automata. It briefly describes these algorithms, discusses their ru...

متن کامل

Edit-Distance Of Weighted Automata: General Definitions And Algorithms

The problem of computing the similarity between two sequences arises in many areas such as computational biology and natural language processing. A common measure of the similarity of two strings is their edit-distance, that is the minimal cost of a series of symbol insertions, deletions, or substitutions transforming one string into the other. In several applications such as speech recognition...

متن کامل

Embedding Multi-Hemirings into Semirings

Weighted languages and weighted automata over multi-hemirings are capable of expressing quantitative properties of systems such as its average or discounted costs. We study the applicability of well-known semiring theory and algorithms in the field of multi-hemirings. To this end, we embed multi-hemirings into semirings and extend this embedding for weighted languages and weighted automata. We ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009